Agenda

  • Status update for the DAUF project
  • New ABM 2025 with overall result and updates
  • New beta version of the subject-based KTH Research Information app
  • News related to data curation - new version of DiVA coming
  • OpenAlex on Sunet
  • Future directions and your questions and feedback

About the DAUF project

  • Creating services and tools for presentation of research information data, improved data flows and connecting data sources within KTH
  • Agile model with 2 week sprints
  • Collaboration between KTH Library, RSO and ITA
  • Part of the IT portfolio for Research (Delportfölj forskning), within the object “Publicering och analys”

Status and progress update

Progress overview - since last demo

  • This year's version of ABM was released about a week ago

  • Recently released beta version of the KTH Research Information app

  • POC for the KTH Indicators dashboard based on consolidated indicators collected from across KTH.

  • Tests and prep for GDP 2.0 (Gemensamma dataprojektet) - new standard for Swedish project data

  • Work on using OpenAlex to update DiVA and to build a bibliometric database

Annual Bibliometric Monitoring 2025

Changes in ABM 2024

  • More interactive graphs (plotly)
  • Changed OA graph
  • Enabled selection of number of rows for co-publication tables
  • Some cosmetic changes

Brief ABM results for KTH

  • Number of publications seems to have stabilized
  • Citation indicators relatively stable
  • Journal indicators stable but slightly increasing over last 5 years
  • Small changes in co-publication patterns
  • Share of Open Access publications decreased sharply last year
    • reasons unclear at the moment

Subject-based Research Information app

Data curation and DiVA

  • DiVA is harvested through OAI-PMH into a database
  • Curation and annotation can be done against this database
  • Preparation for the new DiVA
  • Ambition to decouple importing and curation from DiVA
  • Preparation to use APIs to communicate with DiVA
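
A minimal sketch of how an OAI-PMH harvest loop can work, using only the Python standard library. It is an illustration, not the DAUF implementation: the base URL is a placeholder, and the `mods` metadata prefix and `all-kth` set name in the usage note are assumptions. The `fetch` parameter is a hypothetical hook that lets the loop be tried without network access.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_records(xml_data):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_data)
    records = []
    for rec in root.iterfind(".//oai:ListRecords/oai:record", OAI_NS):
        header = rec.find("oai:header", OAI_NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            # Deleted records are flagged with status="deleted" on the header.
            "deleted": header.get("status") == "deleted",
        })
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    token = token.strip() if token else ""
    return records, (token or None)


def _http_get(url):
    """Default fetcher: return the raw response body as bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def harvest(base_url, metadata_prefix, set_spec=None, fetch=_http_get):
    """Yield record headers page by page until no resumption token is left."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    while True:
        page = fetch(base_url + "?" + urllib.parse.urlencode(params))
        records, token = parse_list_records(page)
        yield from records
        if token is None:
            break
        # A resumed request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Under these assumptions, `harvest("https://example.org/oai", "mods", "all-kth")` would iterate over all record headers in the set; the resulting records could then be written to a database for curation.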

DiVA curation and stats

  • some summary data?

DiVA harvesting

The DAUF project now harvests DiVA publication data using the OAI-PMH protocol. The harvest regularly updates a single-file DuckDB database, which is openly available from object storage:

https://data.bibliometrics.lib.kth.se/kthcorpus/oai.db

The database with the harvested information is currently about 4.4 GB. It is regularly updated and contains MODS and JSON representations of “all-kth” DiVA records.
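
A possible way to explore the database without downloading the full file is to attach it read-only over HTTPS from Python. This is a sketch under assumptions: it requires the third-party `duckdb` package with its `httpfs` extension and a DuckDB version compatible with the file, the helper names `attach_sql` and `list_tables` are made up for illustration, and no table names are assumed, so the example only lists whatever tables exist.

```python
DB_URL = "https://data.bibliometrics.lib.kth.se/kthcorpus/oai.db"


def attach_sql(url, alias="oai"):
    """Build the ATTACH statement for a remote, read-only DuckDB file."""
    return f"ATTACH '{url}' AS {alias} (READ_ONLY)"


def list_tables(url=DB_URL, alias="oai"):
    """Attach the remote database and return the names of its tables."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect()  # in-memory session; the remote file is not copied
    try:
        # httpfs provides HTTP(S) access to remote files.
        con.execute("INSTALL httpfs")
        con.execute("LOAD httpfs")
        con.execute(attach_sql(url, alias))
        rows = con.execute(
            "SELECT table_name FROM duckdb_tables() WHERE database_name = ?",
            [alias],
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]
```

Calling `list_tables()` needs network access. For repeated or heavy queries, downloading the file once and opening it locally is likely to be faster.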

Swedish bibliometric resource based on OpenAlex

KTH Indicators dashboard

  • Ambition at KTH to track visions and goals using indicators
  • Project within Strategisk verksamhetsanalys
  • Workshops to align available data with goals
  • Relatively manual data collection process from scattered systems at KTH
  • Indicator report + beta dashboard for testing purposes
  • Intended for KTH leadership; enables comparison of indicators across schools

Data Curation

Data infrastructure overview

Object storage (S3)

General Dataflow

+--------------------------------+
|                                |
|          Data Sources          |
|                                |
+--------------------------------+
                 |                
  Clean / Crosscheck / Transform  
                 v                
+--------------------------------+
|                                |
|          Curated Data          |
|                                |
+--------------------------------+
                 |                
           Write / POST           
                 v                
+--------------------------------+
|                                |
|     Object Storage (minio)     |
|                                |
+--------------------------------+
                 |                
            Read / GET            
                 v                
+--------------------------------+
|                                |
|     Data Consumer / Client     |
|                                |
+--------------------------------+
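
The four stages in the diagram can be sketched in a few lines of Python. This is an illustration only: `InMemoryBucket` is a hypothetical stand-in for the object storage, and the record fields (`id`, `title`) and the cleaning rules are assumptions chosen to keep the example small.

```python
import json


def curate(raw_records):
    """Clean / crosscheck / transform: skip records without an id,
    drop duplicate ids and trim whitespace from titles."""
    seen, curated = set(), []
    for rec in raw_records:
        rec_id = rec.get("id")
        if not rec_id or rec_id in seen:
            continue
        seen.add(rec_id)
        curated.append({"id": rec_id, "title": (rec.get("title") or "").strip()})
    return curated


class InMemoryBucket:
    """Stand-in for an object storage bucket: put = write, get = read."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def publish(bucket, key, records):
    """Write curated records to storage as one JSON object."""
    bucket.put(key, json.dumps(records, ensure_ascii=False).encode("utf-8"))


def consume(bucket, key):
    """Read the curated records back, as a data consumer would."""
    return json.loads(bucket.get(key).decode("utf-8"))
```

The point of the design is that consumers only read curated objects from storage and never touch the data sources directly, so sources and clients can change independently.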

GDP

GDP (Gemensamma data för projekt) is an effort by a number of Swedish research funders to create a common data model for project data. The five funding agencies Energimyndigheten, Formas, Forte, Vetenskapsrådet and Vinnova are developing a standard that enables sharing of open data about funding and related information.

The standard is developed in cooperation with a reference group that includes universities and other organisations within the university sector. KTH is a participant in the reference group.

GDP data mobilization

Future work and discussion

Future work and directions

  • x

Related activities

  • KTH Cris/Rims

  • KTH Insights/datastyrning (MS Fabric/Power BI)

Questions and Answers

Please provide your input in chat or verbally.

  • Questions, suggestions or comments?

If you prefer to give your feedback later or come up with questions after this demo, you are always welcome to email us at biblioteket@kth.se.

Thank you for attending!